


Congestion, equilibrium and learning: 
The minority game* 




Willemien Kets, Mark Voorneveld

August 2007

Abstract

The minority game is a simple congestion game in which the players' main goal
is to choose among two options the one that is adopted by the smallest number
of players. We characterize the set of Nash equilibria and the limiting behavior of 
several well-known learning processes in the minority game with an arbitrary odd 
number of players. Interestingly, different learning processes provide considerably 

1 Introduction



Congestion games are ubiquitous in economics. In a congestion game (Rosenthal, 1973),
players use several facilities from a common pool. The costs or benefits that a
player derives from a facility depend on the number of users of that facility. A congestion
game is therefore a natural game to model scarcity of common resources. Examples
of such systems include vehicular traffic (Nagel et al., 1997), packet traffic in networks
(Huberman and Lukose, 1997), and ecologies of foraging animals (DeAngelis and Gross,
1992). Similar coordination problems are encountered in market entry games (Selten and Güth,
1982).

Congestion games are also interesting from a theoretical point of view. In congestion 
games, players need to coordinate to differentiate. This seems to be more difficult than 
coordinating on the same action, as any commonality of expectations is broken up. For 
instance, when commuters have to choose between two roads A and B and all believe that 
the others will choose road A, nobody will choose that road, invalidating beliefs. The 
sorting of players predicted in the pure-strategy Nash equilibria of such games violates 
the common belief that in symmetric games, all rational players will evaluate the situation 
identically, and hence, make the same choices in similar situations (see Harsanyi and Selten,
1988, p. 73). Moreover, in congestion games, players may obtain asymmetric payoffs in
equilibrium, which may complicate attainment of equilibrium, as coordination cannot be
achieved through tacit coordination based on historical precedent (cf. Meyer et al., 1992).
Finally, congestion games often have many equilibria, so that players also face the difficulty 
of coordinating on the same equilibrium. 

Therefore, it is an interesting question what type of behavior game theory predicts
in such games. In this paper, we characterize the equilibria of the minority game, a
simple congestion game based on the El Farol bar problem of Arthur (1994), and we
study the limiting behavior of a number of well-known learning processes for this game.
In the minority game, an odd number of players (odd, so that minorities are well-defined)
choose between two ceteris paribus identical alternatives. Congestion is costly, so players
prefer the alternative chosen by the smallest number of players. The minority game is
thus closely related to the market entry game, a game extensively studied in experimental
economics (see the survey of Ochs (1999) and references therein; for a recent contribution
see Duffy and Hopkins (2005)). While the market entry game models situations in which
players can choose between a safe option (staying out of the market) and an alternative
whose payoffs are declining in the number of other players choosing that option (entering),
the minority game is a suitable model for more symmetric situations in which the payoffs
of both actions depend on the number of other players choosing that action. In such
situations, players will need to outsmart other players, so as to be one step ahead of their
opponents. For instance, the minority game may be a good model for financial markets,
where investors try to identify the underpriced shares, and try to sell the shares they expect
to fall in the future. The minority game has been studied by a number of authors in
economics. Renault et al. (2005) study repeated play in the game. Bottazzi and Devetag
(2007), Chmura and Pitz (2006), and Helbing et al. (2005) study the game experimentally.
The game has also been studied extensively in the physics literature; see Challet et al.
(2004) or Coolen (2005) for an overview.

Interestingly, we find that the predictions from different learning processes do not
coincide. While the replicator dynamic predicts that play converges to a Nash equilibrium
with at most one player who chooses a strictly mixed strategy, the set of stationary points
under the perturbed best-response dynamics consists of the logit quantal response equilibria
of the game (McKelvey and Palfrey, 1995).[1] For the case of three players, we show that the
set of Nash equilibria that are the limit of a sequence of logit quantal response equilibria
with vanishing noise consists of Nash equilibria with at most one mixer and the Nash
equilibrium in which all players randomize equally over their two actions. Finally, we study
the best-reply learning process with limited memory of Hurkens (1995) and the related
model of Kets and Voorneveld (2005). Hurkens studies a learning model in which players
choose an arbitrary action that is a best reply to some belief over other players' actions that
is consistent with their recent past play. In the learning model of Kets and Voorneveld,
players also best-reply to beliefs over others' play supported by recent past play, but in
addition display a so-called recency bias: when there are multiple
best replies to a given belief, a player chooses the action that he most recently played. We
show that while the process of Hurkens offers no sharp predictions for the minority game,
the model of Kets and Voorneveld predicts that play converges to one of the pure Nash
equilibria of the game when players have a memory length of at least two periods.

The current paper is related to the literature on learning in congestion games and, more
generally, learning in potential games (e.g. Hofbauer and Hopkins, 2005; Hofbauer and Sandholm,
2002; Sandholm, 2001, 2007). Papers that study learning in games similar to the game
considered here include Blonski (1999), Franke (2003) and Kojima and Takahashi (2004).
Most of these papers focus on the predictions of a single learning model,[2] while we compare
predictions from different learning models. Moreover, while most results are obtained for
games with either a small number or a continuum of players, we characterize the equilibria
of the game and the limiting behavior of different learning processes for any (odd) number
of players.

[1] For a definition of these learning processes, see Sections 3 and 4, respectively.
[2] Duffy and Hopkins (2005) and Kojima and Takahashi (2004) are notable exceptions.

The outline of this paper is as follows. In Section 2, we define the game and characterize
its Nash equilibria. In Section 3, we characterize the set of stationary states and the set
of asymptotically stable states under the replicator dynamic. In Section 4, we characterize
the set of stationary states under the perturbed best-response dynamics. In Section 5,
we characterize the limiting behavior in the minority game under the best-reply learning
processes with limited memory. Section 6 concludes.



2 The minority game 



2.1 Basic definitions 



Following the notation of Tercieux and Voorneveld (2005), we denote the set of players
by $N = \{1, \ldots, 2k+1\}$, with $k \in \mathbb{N}$. Each player $i \in N$ has a set of pure strategies
$A_i = \{-1, +1\}$: agents have to choose between two options. The set of mixed strategies of
player $i$ is denoted by $\Delta(A_i)$. We denote a mixed strategy profile by $\alpha \in \times_{i \in N} \Delta(A_i)$, and
we use the standard notation $\alpha_{-i} \in \times_{j \in N \setminus \{i\}} \Delta(A_j)$ to denote a strategy profile of players
other than $i \in N$. With each action $a \in \{-1, +1\}$, a function

$$f_a : \{1, \ldots, 2k+1\} \to \mathbb{R}$$

can be associated which indicates for each $n \in \{1, \ldots, 2k+1\}$ the payoff $f_a(n)$ to a player
choosing $a$ when the total number of players choosing $a$ equals $n$. The von Neumann-
Morgenstern utility function of a player is then given by

$$u_i(a) = f_{a_i}(|\{j \in N : a_j = a_i\}|), \tag{2.1}$$

where $a \in \times_{j \in N} A_j$. Payoffs are extended to mixed strategies in the usual way.

The function $f_a(\cdot)$, $a \in \{-1, +1\}$, can have several forms. We make the common
assumptions (e.g. Challet et al., 2004) that congestion is costly:

[Mon] $f_{-1}$ and $f_{+1}$ are strictly decreasing functions,

and that the congestion effect is the same across alternatives:

[Sym] $f_{-1} = f_{+1}$.



We refer to a player who uses a mixed strategy that puts positive probability on both pure
strategies as a mixer. A player who puts full probability mass on the alternative $-1$ is called
a $(-1)$-player; similarly, a player who puts full probability mass on the alternative $+1$ is
called a $(+1)$-player.



2.2 Nash equilibria 



Throughout this section, let $k \in \mathbb{N}$ and consider a minority game with $2k+1$ players.
We characterize its set of Nash equilibria. The pure Nash equilibria are easy to characterize:

Proposition 2.1. [Tercieux and Voorneveld (2005)] A pure strategy profile is a Nash
equilibrium if and only if one of the alternatives $-1$ or $+1$ is chosen by exactly $k$ of the
$2k+1$ players.
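
As an illustration of the payoff definition (2.1) and of Proposition 2.1, the following minimal sketch enumerates all pure profiles of a small game and keeps those from which no player can profitably deviate. The payoff function `f(n) = -n` and the helper names are assumptions chosen for the example; any strictly decreasing `f` satisfying [Mon] and [Sym] could be used instead.

```python
from itertools import product

def payoff(profile, i, f):
    """Payoff (2.1): f evaluated at the number of players choosing i's action."""
    return f(profile.count(profile[i]))

def is_pure_nash(profile, f):
    """True if no player gains by unilaterally switching to the other action."""
    for i in range(len(profile)):
        deviation = profile[:i] + (-profile[i],) + profile[i + 1:]
        if payoff(deviation, i, f) > payoff(profile, i, f):
            return False
    return True

def pure_nash_equilibria(k, f):
    """All pure Nash equilibria of the minority game with 2k + 1 players."""
    n = 2 * k + 1
    return [p for p in product((-1, +1), repeat=n) if is_pure_nash(p, f)]

f = lambda n: -n  # hypothetical strictly decreasing payoff function
k = 2
equilibria = pure_nash_equilibria(k, f)
# Size of the smaller group in each equilibrium found by brute force.
minority_sizes = {min(p.count(-1), p.count(+1)) for p in equilibria}
print(len(equilibria), minority_sizes)
```

In every equilibrium found this way, the smaller group has exactly `k` members, in line with the proposition.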

It remains to characterize the game's Nash equilibria with at least one mixer.

Lemma 2.2. Let $\alpha \in \times_{i \in N} \Delta(A_i)$ be a Nash equilibrium with a nonempty set of mixers.
All mixers use the same strategy: for all $i, j \in N$, if $\alpha_i, \alpha_j \notin \{(1,0), (0,1)\}$, then $\alpha_i = \alpha_j$.

Proof. By [Sym], the $2 \times 2$ subgame played by two mixers $i$ (row player) and $j$ (column
player) given the strategy profile of the remaining players is of the form

$$\begin{array}{c|cc} & -1 & +1 \\ \hline -1 & x, x & y, z \\ +1 & z, y & w, w \end{array}$$

where, for instance, $y$ is the payoff to the player choosing $-1$ if the other player chooses
$+1$ and the remaining players stick to the mixed strategy profile $(\alpha_k)_{k \in N \setminus \{i,j\}}$. By [Mon],
a player is better off if the other chooses differently, i.e., $x < y$ and $z > w$. Let $p, q \in (0,1)$
denote the equilibrium probability with which player $i$ and $j$, respectively, choose $-1$. In
equilibrium, each player must be indifferent between playing $+1$ and playing $-1$:

$$px + (1-p)y = pz + (1-p)w,$$
$$qx + (1-q)y = qz + (1-q)w.$$

Subtracting the latter expression from the former yields

$$(p-q)(x-y) = (p-q)(z-w).$$

As $x < y$ and $z > w$, this can only hold if $p = q$. Since mixers $i$ and $j$ were chosen
arbitrarily from the set of mixers, this implies that all mixers use the same strategy. □

Since all mixers use the same strategy and player labels are irrelevant by [Sym] (if $\alpha$
is a Nash equilibrium, so is every permutation of $\alpha$), a non-pure Nash equilibrium can
be summarized by its type $(\ell, r, \lambda)$, where $\ell, r \in \{0, 1, \ldots, 2k+1\}$ denote the number of
players choosing pure strategy $-1$ or $+1$, respectively, and $\lambda \in (0,1)$ the probability with
which the remaining $m(\ell, r, \lambda) := (2k+1) - (\ell + r) > 0$ mixers choose $-1$. Moreover,
let $v_{-1}(\ell, r, \lambda)$ denote the expected payoff to a player choosing $-1$; $v_{+1}(\ell, r, \lambda)$ is defined
similarly. For convenience, write $m := m(\ell, r, \lambda)$. Letting one of the mixers in $(\ell, r, \lambda)$
deviate to a pure strategy, this implies in particular that

$$v_{-1}(\ell+1, r, \lambda) = \sum_{s=0}^{m-1} \binom{m-1}{s} \lambda^s (1-\lambda)^{m-1-s} f_{-1}(\ell+1+s), \tag{2.2}$$

$$v_{+1}(\ell, r+1, \lambda) = \sum_{s=0}^{m-1} \binom{m-1}{s} \lambda^s (1-\lambda)^{m-1-s} f_{+1}((r+1) + (m-1-s)) = \sum_{s=0}^{m-1} \binom{m-1}{s} \lambda^s (1-\lambda)^{m-1-s} f_{+1}(r+m-s). \tag{2.3}$$

For instance, a profile of type $(\ell+1, r, \lambda)$ is obtained from type $(\ell, r, \lambda)$ if a mixer switches
to pure strategy $-1$. In that case, there are $m-1$ mixers left. To obtain expected
payoffs, notice that the probability that $s \in \{0, \ldots, m-1\}$ of these mixers choose $-1$ is
$\binom{m-1}{s} \lambda^s (1-\lambda)^{m-1-s}$. Using this notation, the Nash equilibria with at least one mixer are
characterized as follows.

Proposition 2.3.

(a) (Characterization of equilibrium) Let $\ell, r \in \{0, 1, \ldots, 2k+1\}$ be such that $\ell + r < 2k+1$.
Let $\lambda \in (0,1)$. A strategy profile of type $(\ell, r, \lambda)$ is a Nash equilibrium if and
only if

$$v_{-1}(\ell+1, r, \lambda) = v_{+1}(\ell, r+1, \lambda). \tag{2.4}$$

(b) (Equilibria with one mixer) There exist equilibria with exactly one mixer. These
equilibria are of type $(k, k, \lambda)$ with arbitrary $\lambda \in (0,1)$, i.e., the mixer uses an
arbitrary mixed strategy, whereas the remaining $2k$ players are spread evenly over the
two pure strategies.

(c) (Equilibria with more than one mixer) Let $\ell, r \in \{0, 1, \ldots, 2k+1\}$ be such that
$\ell + r \leq 2k-1$. There is a Nash equilibrium of type $(\ell, r, \lambda)$ if and only if $\max\{\ell, r\} < k$.
The corresponding probability $\lambda \in (0,1)$ solving (2.4) is unique.

Proof. (a): Condition (2.4) says that a mixer is indifferent between choosing $-1$, thereby
raising $\ell$ to $\ell+1$ and obtaining payoff $v_{-1}(\ell+1, r, \lambda)$, or choosing $+1$, thereby raising $r$ to
$r+1$ and obtaining payoff $v_{+1}(\ell, r+1, \lambda)$. Hence, (2.4) is a necessary condition for Nash
equilibrium.

To establish sufficiency, it remains to show that players using a pure strategy (if
there are such players, i.e., if $\ell + r \geq 1$) also choose a best reply. Suppose $\ell \geq 1$. The payoff
to a $(-1)$-player is $v_{-1}(\ell, r, \lambda)$, while a unilateral deviation to $+1$ yields $v_{+1}(\ell-1, r+1, \lambda)$.
However:

$$v_{-1}(\ell, r, \lambda) > v_{-1}(\ell+1, r, \lambda) \tag{2.5}$$
$$= v_{+1}(\ell, r+1, \lambda) \tag{2.6}$$
$$> v_{+1}(\ell-1, r+1, \lambda). \tag{2.7}$$

Inequality (2.5) uses [Mon]: conditioning on the behavior of one of the $m := m(\ell, r, \lambda) > 0$
mixers, write

$$v_{-1}(\ell, r, \lambda) = \lambda v_{-1}(\ell+1, r, \lambda) + (1-\lambda) v_{-1}(\ell, r+1, \lambda).$$

Then

$$v_{-1}(\ell, r, \lambda) - v_{-1}(\ell+1, r, \lambda) = (1-\lambda) [v_{-1}(\ell, r+1, \lambda) - v_{-1}(\ell+1, r, \lambda)]$$
$$= (1-\lambda) \sum_{s=0}^{m-1} \binom{m-1}{s} \lambda^s (1-\lambda)^{m-1-s} [f_{-1}(\ell+s) - f_{-1}(\ell+1+s)] > 0$$

by [Mon]. Inequality (2.7) follows similarly and (2.6) is simply condition (2.4). So if $\ell \geq 1$,
$(-1)$-players choose a best reply. Similarly, if $r \geq 1$, $(+1)$-players choose a best reply.

(b): Let $\lambda \in (0,1)$. Substitution in (2.4) and [Sym] yield that strategy profiles of type
$(k, k, \lambda)$ are Nash equilibria:

$$v_{-1}(k+1, k, \lambda) = f_{-1}(k+1) = f_{+1}(k+1) = v_{+1}(k, k+1, \lambda).$$

Conversely, consider a Nash equilibrium of type $(\ell, r, \lambda)$ with exactly one mixer: $\ell + r = 2k$.
We establish that $\ell = r$. Suppose not. W.l.o.g., $\ell > r$. Since $\ell + r = 2k$, this implies
$\ell \geq k+1$ and $r \leq k-1$. The expected payoff to a $(-1)$-player is

$$\lambda f_{-1}(\ell+1) + (1-\lambda) f_{-1}(\ell),$$

while deviating to $+1$ would yield

$$\lambda f_{+1}(r+1) + (1-\lambda) f_{+1}(r+2).$$

Since $\ell+1 > r+1$, $\ell \geq r+2$, and $\lambda \in (0,1)$, it follows from [Sym] and [Mon] that a
$(-1)$-player would benefit from unilateral deviation, contradicting the assumption that the
profile of type $(\ell, r, \lambda)$ is a Nash equilibrium. Conclude that $\ell = r$.

(c): Without loss of generality, $\ell \geq r$, so $\max\{\ell, r\} = \ell$. Let $m = (2k+1) - (\ell + r) \geq 2$
be the number of mixers. By substitution, $\ell < k$ if and only if $\ell+1 < r+m$. To prove
(c), it therefore remains to establish three things.

Firstly, if $\ell+1 < r+m$, there is a $\lambda \in (0,1)$ solving (2.4). To see this, use $\ell \geq r$ to
find that $\ell+m > r+1$. By [Sym] and [Mon], it follows that

$$v_{-1}(\ell+1, r, 0) = f_{-1}(\ell+1) > f_{+1}(r+m) = v_{+1}(\ell, r+1, 0),$$
$$v_{-1}(\ell+1, r, 1) = f_{-1}(\ell+m) < f_{+1}(r+1) = v_{+1}(\ell, r+1, 1).$$

By the Intermediate Value Theorem applied to $v_{-1}(\ell+1, r, \cdot) - v_{+1}(\ell, r+1, \cdot)$, there is a
$\lambda \in (0,1)$ solving (2.4): there is a Nash equilibrium of type $(\ell, r, \lambda)$.

Secondly, this $\lambda \in (0,1)$ solving (2.4) is unique. By (2.2), $v_{-1}(\ell+1, r, \cdot)$ is the
expectation of a strictly decreasing function of a binomial stochastic variable. By stochastic
dominance (see Appendix A), this makes $v_{-1}(\ell+1, r, \cdot)$, the left-hand side of (2.4), strictly
decreasing in $\lambda$. Similarly, by (2.3), the right-hand side of (2.4) is strictly increasing in
$\lambda$. Conclude that the functions $v_{-1}(\ell+1, r, \cdot)$ and $v_{+1}(\ell, r+1, \cdot)$ intersect at most once.
By the previous step, as long as $\ell+1 < r+m$, they intersect at least once, establishing
uniqueness.

Thirdly, if $\ell+1 \geq r+m$, there is no $\lambda \in (0,1)$ solving (2.4). To see this, notice that
the inequality implies

$$\ell+m > \cdots > \ell+2 > \ell+1 \geq r+m > r+m-1 > \cdots > r+1,$$

so by [Sym] and [Mon]:

$$f_{-1}(\ell+m) < \cdots < f_{-1}(\ell+2) < f_{-1}(\ell+1) \leq f_{+1}(r+m) < f_{+1}(r+m-1) < \cdots < f_{+1}(r+1).$$

Substitution in (2.2) and (2.3) yields that

$$v_{+1}(\ell, r+1, \lambda) > v_{-1}(\ell+1, r, \lambda)$$

for all $\lambda \in (0,1)$: there is no solution to (2.4). □
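
The existence and uniqueness argument in part (c) suggests a simple numerical procedure: since the left-hand side of (2.4) is decreasing and the right-hand side increasing in $\lambda$, the solution can be located by bisection. The sketch below assumes the hypothetical payoff function `f(n) = -n`; the function names are illustrative and not taken from the paper.

```python
from math import comb

def v_minus(l, r, lam, k, f):
    """Expected payoff to a (-1)-player in a profile of type (l, r, lam), l >= 1."""
    m = 2 * k + 1 - l - r  # number of mixers in this profile
    return sum(comb(m, s) * lam**s * (1 - lam)**(m - s) * f(l + s)
               for s in range(m + 1))

def v_plus(l, r, lam, k, f):
    """Expected payoff to a (+1)-player in a profile of type (l, r, lam), r >= 1."""
    m = 2 * k + 1 - l - r
    return sum(comb(m, s) * lam**s * (1 - lam)**(m - s) * f(r + m - s)
               for s in range(m + 1))

def gap(l, r, lam, k, f):
    """Left-hand side minus right-hand side of the indifference condition (2.4)."""
    return v_minus(l + 1, r, lam, k, f) - v_plus(l, r + 1, lam, k, f)

def solve_lambda(l, r, k, f, tol=1e-12):
    """Bisection on (2.4); returns None if the gap has no sign change on [0, 1]."""
    lo, hi = 0.0, 1.0
    if not (gap(l, r, lo, k, f) > 0 > gap(l, r, hi, k, f)):
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(l, r, mid, k, f) > 0:
            lo = mid  # the gap is decreasing in lam, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

f = lambda n: -n  # hypothetical strictly decreasing payoff function
k = 2
for l in range(2 * k):
    for r in range(2 * k - l):  # l + r <= 2k - 1, i.e. at least two mixers
        print((l, r), solve_lambda(l, r, k, f))
```

With at least two mixers, a solution is returned exactly for the types with $\max\{\ell, r\} < k$, which is the condition in Proposition 2.3(c).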
Some consequences of this characterization of the game's non-pure Nash equilibria:

(i): There are no Nash equilibria where the number of mixers is two, since in that case,
$\max\{\ell, r\} \geq k$.

(ii): Substitution in (2.4) gives that a strategy profile in which the number of $(-1)$-players
is equal to the number of $(+1)$-players and the remaining players mix with probability 1/2,
i.e., a profile of type $(t, t, 1/2)$ with $t \in \{0, \ldots, k\}$, is a Nash equilibrium.

Having characterized the set of Nash equilibria, we now establish that the set of Nash
equilibria with at most one mixer is connected.

Proposition 2.4. The set of Nash equilibria with at most one mixer is connected.

Proof. In a Nash equilibrium with exactly one mixer, the mixer's strategy is
arbitrary. Letting the probability go to zero or one, this line segment of Nash equilibria in the
strategy space has a pure Nash equilibrium as its end point. Hence, to show connectedness,
it suffices to show that for each pair of pure Nash equilibria, there is a chain of pure Nash
equilibria connecting them in which consecutive equilibria differ in exactly one coordinate.

So let $x$ and $y$ be distinct pure Nash equilibria. By Proposition 2.1, the majority action,
i.e., the action chosen by exactly $k+1$ players in a given Nash equilibrium, is well-defined.
We need to consider two cases. Firstly, if this action is the same in $x$ and $y$, w.l.o.g. $-1$,
then $x \neq y$ implies that the $(k+1)$-player majorities in $x$ and $y$ must be distinct. Let $i$
be a majority player in $x$ who chooses $-1$ in $x$, but $+1$ in $y$. Secondly, if the majority action
is different in $x$ and $y$, w.l.o.g. $-1$ in $x$ and $+1$ in $y$, then by definition of a majority,
the $(k+1)$-player majorities in $x$ and $y$ have a nonempty intersection. Again, let $i$ be a
majority player choosing $-1$ in $x$, but $+1$ in $y$.

By construction, as $i$ is a majority player, the path of Nash equilibria in which $i$
increases the probability of playing the action $+1$ from 0 to 1 connects $x$ to another pure
Nash equilibrium $x^*$ with $x_i \neq x^*_i = y_i$ and $x^*_j = x_j$ for all $j \neq i$, i.e., with a strictly smaller
Hamming distance to $y$ (recall that the Hamming distance between two finite-dimensional
vectors is the number of coordinates in which they differ).

As the strategy vectors have only finitely many coordinates and we can reduce the
Hamming distance between pure Nash equilibria by the procedure above, the result now
follows by induction. □
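
The construction in the proof is effectively an algorithm: repeatedly pick a majority player of the current equilibrium whose action differs from the target and flip it. A minimal sketch follows, with pure profiles represented as tuples over `{-1, +1}`; the function names and the example profiles are illustrative assumptions.

```python
def majority_action(x):
    """Action chosen by k + 1 of the 2k + 1 players in a pure Nash equilibrium."""
    return -1 if x.count(-1) > x.count(+1) else +1

def next_equilibrium(x, y):
    """Flip one majority player of x whose action differs from the one in y."""
    maj = majority_action(x)
    for i, (xi, yi) in enumerate(zip(x, y)):
        if xi == maj and yi != xi:
            return x[:i] + (yi,) + x[i + 1:]
    raise ValueError("x and y coincide or are not pure Nash equilibria")

def chain(x, y):
    """Chain of pure Nash equilibria from x to y, one coordinate changed per step."""
    path = [x]
    while path[-1] != y:
        path.append(next_equilibrium(path[-1], y))
    return path

# Two pure equilibria of the five-player game (k = 2), both with majority action -1.
x = (-1, -1, -1, +1, +1)
y = (+1, +1, -1, -1, -1)
for profile in chain(x, y):
    print(profile)
```

Each step reduces the Hamming distance to `y` by one, so the chain has as many steps as there are coordinates in which `x` and `y` differ.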
3 The replicator dynamic

In this section, we study the replicator dynamic (e.g. Weibull, 1995) for the minority
game. There is a set $N = \{1, \ldots, 2k+1\}$ of populations, where each population is the unit
interval $[0, 1]$. The populations represent the $2k+1$ player positions in the minority game.
All agents in a population are initially programmed to some pure strategy. Hence, each
population can be divided into two subpopulations (one of which may contain no agents),
one for each of the pure strategies in the minority game. A population state is a vector
$\alpha = (\alpha_1, \ldots, \alpha_{2k+1})$ in the polyhedron of mixed-strategy profiles, where for each $i \in N$,
$\alpha_i$ is a point in the simplex $\Delta(A_i)$, representing the distribution of agents in population
$i$ across the different pure strategies. The vector $\alpha_i \in \Delta(A_i)$ thus represents the state of
population $i$, with $\alpha_i(a_i)$ denoting the proportion of agents programmed to play the pure
strategy $a_i \in A_i$.

Time is continuous and indexed by t. Agents - one from each population - are con- 
tinuously drawn uniformly at random from these populations to play the minority game. 
Suppose payoffs represent the effect of playing the game on an agent's fitness, measured as 
the number of offspring per time unit, and that each offspring inherits its single parent's 
strategy. This gives rise to the following dynamics for the population shares: 

$$\forall i \in N, \forall a_i \in A_i : \quad \dot{\alpha}_i(a_i) = \alpha_i(a_i) \, (u_i(a_i, \alpha_{-i}) - u_i(\alpha_i, \alpha_{-i})). \tag{3.1}$$

This system of differential equations defines the (continuous time multipopulation)
replicator dynamic. In words, the growth rate $\dot{\alpha}_i(a_i)/\alpha_i(a_i)$ of a pure strategy $a_i \in A_i$ in
population $i \in N$ is equal to the difference in payoffs of the pure strategy and the
current average payoffs for the population. Hence, the population shares of strategies that
do better than average will grow, while the shares of the other strategies will decline. It
is easily seen that the subpopulations associated with the pure best replies to the current
population state have the highest growth rates.
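
To make the dynamics concrete, the following sketch integrates (3.1) with a plain Euler scheme. With two actions, the state of population $i$ is summarized by the share $p_i$ playing $-1$, and (3.1) reduces to $\dot{p}_i = p_i (1 - p_i)\,[u_i(-1, \alpha_{-i}) - u_i(+1, \alpha_{-i})]$. The payoff function `f(n) = -n`, the step size, and the initial state are assumptions made for the example.

```python
def others_distribution(p, i):
    """Distribution of the number of players other than i who choose -1."""
    dist = [1.0]
    for j, pj in enumerate(p):
        if j == i:
            continue
        new = [0.0] * (len(dist) + 1)
        for s, prob in enumerate(dist):
            new[s] += prob * (1 - pj)   # player j chooses +1
            new[s + 1] += prob * pj     # player j chooses -1
        dist = new
    return dist

def payoff_gap(p, i, f):
    """u_i(-1, alpha_-i) - u_i(+1, alpha_-i) when the others mix according to p."""
    n = len(p)
    dist = others_distribution(p, i)
    # If s others choose -1, then -1 is chosen by s + 1 players, +1 by n - s players.
    return sum(prob * (f(s + 1) - f(n - s)) for s, prob in enumerate(dist))

def replicator_step(p, f, dt):
    """One Euler step of the two-action replicator dynamic."""
    return [pi + dt * pi * (1 - pi) * payoff_gap(p, i, f) for i, pi in enumerate(p)]

def simulate(p0, f, dt=0.01, steps=20000):
    p = list(p0)
    for _ in range(steps):
        p = replicator_step(p, f, dt)
    return p

f = lambda n: -n  # hypothetical strictly decreasing payoff function
final_state = simulate([0.9, 0.5, 0.1], f)
print(final_state)
```

From this initial state, the first population moves towards $-1$ and the third towards $+1$, while the second remains mixed: a state of the one-mixer kind described in Proposition 2.3(b).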

The system of differential equations (3.1) defines a continuous solution mapping
$\xi : \mathbb{R} \times (\times_{i \in N} \Delta(A_i)) \to \times_{i \in N} \Delta(A_i)$ which assigns to each time $t \in \mathbb{R}$ and each initial state
$\alpha^0 \in \times_{i \in N} \Delta(A_i)$ the population state $\xi(t, \alpha^0) \in \times_{i \in N} \Delta(A_i)$. The (solution) trajectory
through a population state $\alpha^0 \in \times_{i \in N} \Delta(A_i)$ is the graph of the solution mapping $\xi(\cdot, \alpha^0)$.

A population state $\alpha \in \times_{i \in N} \Delta(A_i)$ is a stationary state of the replicator dynamics (3.1)
if and only if for each population $i \in N$ every pure strategy $a_i \in A_i$ that is used by some
agents in the population gives the same payoffs. In that case, $\dot{\alpha}_i(a_i) = 0$ for all $i \in N$ and all
$a_i \in A_i$. Let $S = \{\alpha \in \times_{j \in N} \Delta(A_j) \mid \forall i \in N, \forall a_i \in A_i : \dot{\alpha}_i(a_i) = 0\}$ be the set of stationary
states. By definition, if $\alpha \in S$, then a player $i \in N$ either uses a pure strategy or, if he is
a mixer, is indifferent between his two pure strategies: $u_i(a_i, \alpha_{-i}) = u_i(\alpha_i, \alpha_{-i})$ for both
$a_i \in A_i$. Using the proof of Lemma 2.2, all mixers must use the same strategy. If there
is more than one mixer, the proof of Proposition 2.3(c) indicates that this mixed strategy
solving (2.4) is uniquely determined by the number of players choosing pure strategy $-1$
and pure strategy $+1$. Conclude that the set of stationary states can be partitioned into
three subsets:

$S_1$: The connected set of Nash equilibria with at most one mixer;

and a finite collection of isolated stationary states, namely

$S_2$: Nash equilibria with more than one mixer;

$S_3$: nonequilibrium profiles of some type $(\ell, r, \lambda)$, where

$\ell, r \in \{0, \ldots, 2k+1\}$,
$\ell + r \leq 2k+1$,

if $\ell + r < 2k+1$, then $\lambda \in (0,1)$ is uniquely determined by (2.4).

It remains to study the stability properties of these stationary states. We consider Lyapunov
stability and asymptotic stability. Roughly speaking, a population state is Lyapunov
stable if no small change in the population shares can lead the replicator dynamics away
from the population state, while a population state is asymptotically stable if it is Lyapunov
stable and any sufficiently small change in the population shares results in a movement
back to the original population state. Formally, a population state $\alpha \in \times_{i \in N} \Delta(A_i)$ is
Lyapunov stable if every neighborhood $B$ of $\alpha$ contains a neighborhood $B^0$ of $\alpha$ such that
$\xi(t, \alpha^0) \in B$ for every $\alpha^0 \in B^0 \cap \times_{i \in N} \Delta(A_i)$ and $t \geq 0$. It is asymptotically stable if it is
Lyapunov stable, and, in addition, there exists a neighborhood $B^*$ such that

$$\lim_{t \to \infty} \xi(t, \alpha^0) = \alpha$$

for each initial state $\alpha^0 \in B^* \cap \times_{i \in N} \Delta(A_i)$.

The analysis relies heavily on the existence of a Lyapunov function for the replicator
dynamic in the minority game. Tercieux and Voorneveld (2005), using Thm. 3.1 in
Monderer and Shapley (1996), show that a minority game is a (finite exact) potential game:
there exists a real-valued (so-called potential) function $U$ on the pure strategy space such
that for each $i \in N$, each $a_{-i} \in \times_{j \in N \setminus \{i\}} A_j$, and all $a_i, b_i \in A_i$:

$$u_i(a_i, a_{-i}) - u_i(b_i, a_{-i}) = U(a_i, a_{-i}) - U(b_i, a_{-i}). \tag{3.2}$$
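
Identity (3.2) can be checked by brute force on small games. The sketch below uses the standard Rosenthal potential for congestion games, which sums $f(1) + \cdots + f(c)$ over both options, where $c$ is the number of players choosing the option. This particular potential is an assumption of the example: the source does not spell out which potential is used, and potentials are only determined up to an additive constant.

```python
from itertools import product

def payoff(profile, i, f):
    """Payoff (2.1) of player i in a pure profile."""
    return f(profile.count(profile[i]))

def potential(profile, f):
    """Rosenthal potential: for each option, f(1) + ... + f(number of users)."""
    return sum(f(n) for a in (-1, +1) for n in range(1, profile.count(a) + 1))

def satisfies_potential_identity(k, f):
    """Check (3.2) for every pure profile and every unilateral deviation."""
    n = 2 * k + 1
    for profile in product((-1, +1), repeat=n):
        for i in range(n):
            deviation = profile[:i] + (-profile[i],) + profile[i + 1:]
            du = payoff(profile, i, f) - payoff(deviation, i, f)
            dU = potential(profile, f) - potential(deviation, f)
            if du != dU:
                return False
    return True

f = lambda n: -n  # hypothetical strictly decreasing, integer-valued payoff function
print(satisfies_potential_identity(2, f))
```

Integer-valued payoffs are used so that the comparison in (3.2) is exact rather than subject to floating-point rounding.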
Taking expectations, (3.2) can be extended to mixed strategies, so the payoff difference in
(3.1) equals the corresponding change in the potential. Hence, the replicator dynamic can
be rewritten as:

$$\forall i \in N, \forall a_i \in A_i : \quad \dot{\alpha}_i(a_i) = \alpha_i(a_i) \, (U(a_i, \alpha_{-i}) - U(\alpha_i, \alpha_{-i})). \tag{3.3}$$
This makes the potential U a Lyapunov function of the replicator dynamic. More precisely: 

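Proposition 3.1. The potential function $U$ of the minority game is a strict Lyapunov
function for the replicator dynamic: for each solution trajectory $(\alpha(t))_{t \in [0, \infty)}$, $dU(\alpha(t))/dt \geq 0$
with equality exactly in the stationary states.

Proof. Suppressing time indices for ease of notation, direct calculation gives

$$\dot{U}(\alpha) = \sum_{i \in N} \sum_{a_i \in A_i} U(a_i, \alpha_{-i}) \, \dot{\alpha}_i(a_i)$$
$$= \sum_{i \in N} \sum_{a_i \in A_i} \alpha_i(a_i) \, (U(a_i, \alpha_{-i}) - U(\alpha_i, \alpha_{-i})) \, U(a_i, \alpha_{-i})$$
$$= \sum_{i \in N} \Big( \sum_{a_i \in A_i} \alpha_i(a_i) \, U(a_i, \alpha_{-i})^2 - U(\alpha_i, \alpha_{-i})^2 \Big)$$
$$= \sum_{i \in N} \Big( \mathbb{E}_{\alpha_i}[U(a_i, \alpha_{-i})^2] - (\mathbb{E}_{\alpha_i}[U(a_i, \alpha_{-i})])^2 \Big)$$
$$= \sum_{i \in N} \mathrm{Var}_{\alpha_i} \, U(a_i, \alpha_{-i})$$
$$\geq 0,$$

with equality if and only if all variances are zero, i.e., if and only if $\alpha$ is a stationary point
of the replicator dynamics. □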
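
The chain of equalities in the proof can be verified numerically at an arbitrary interior state. The sketch below computes the time derivative of the potential from the replicator equation (3.1), using payoffs $u_i$, and compares it with the sum of variances of the potential. The Rosenthal potential, the payoff function `f(n) = -n**2`, and the chosen state are assumptions made for the example.

```python
from itertools import product

def payoff(profile, i, f):
    return f(profile.count(profile[i]))

def potential(profile, f):
    """Rosenthal potential (an assumed choice; potentials differ by a constant)."""
    return sum(f(n) for a in (-1, +1) for n in range(1, profile.count(a) + 1))

def expected(g, p, i, ai):
    """Expectation of g(profile) when i plays ai and player j plays -1 w.p. p[j]."""
    total = 0.0
    for profile in product((-1, +1), repeat=len(p)):
        if profile[i] != ai:
            continue
        weight = 1.0
        for j, a in enumerate(profile):
            if j != i:
                weight *= p[j] if a == -1 else 1 - p[j]
        total += weight * g(profile)
    return total

def derivative_and_variance(p, f):
    """Return (dU/dt implied by (3.1), sum over players of the variance of U)."""
    d_potential, variance_sum = 0.0, 0.0
    for i, pi in enumerate(p):
        share = {-1: pi, +1: 1 - pi}
        u = {a: expected(lambda x: payoff(x, i, f), p, i, a) for a in (-1, +1)}
        U = {a: expected(lambda x: potential(x, f), p, i, a) for a in (-1, +1)}
        u_bar = sum(share[a] * u[a] for a in (-1, +1))
        U_bar = sum(share[a] * U[a] for a in (-1, +1))
        # Replicator equation (3.1): share growth driven by payoff differences.
        d_potential += sum(U[a] * share[a] * (u[a] - u_bar) for a in (-1, +1))
        variance_sum += sum(share[a] * (U[a] - U_bar) ** 2 for a in (-1, +1))
    return d_potential, variance_sum

f = lambda n: -n ** 2  # hypothetical strictly decreasing payoff function
lhs, rhs = derivative_and_variance([0.2, 0.7, 0.55], f)
print(lhs, rhs)
```

The two numbers agree up to rounding, and both are nonnegative, as the proposition states.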

Proposition 3.2. The collection of Nash equilibria with at most one mixer in $S_1$ is
asymptotically stable under the replicator dynamic. Stationary states in $S_2$ and $S_3$ are not
Lyapunov stable.

Proof. To see that the collection of Nash equilibria in $S_1$ is asymptotically stable, notice
that $S_1$ is the set of global maxima of $U$: The potential $U$ in (3.2) was extended to mixed
strategies by taking expectations, so $U$ achieves a global maximum in a pure strategy
profile which, again by (3.2), is a pure Nash equilibrium. By symmetry, all pure Nash
equilibria are global maxima of $U$ and so are equilibria with exactly one mixer. Other
strategy profiles are not global maxima of $U$: they are not Nash equilibria or, if they are,
they involve more than one mixer, in which case they put positive probability also on pure
strategy profiles that are not Nash equilibria and consequently not global maxima of $U$.
This connected set of global maxima of the Lyapunov function $U$ is asymptotically stable
(Weibull, 1995, Thm. 6.4).



We show that elements of $S_2$ are not Lyapunov stable; the case for points in $S_3$ is
similar. Let $\alpha^* \in S_2$, i.e., $\alpha^*$ is a Nash equilibrium with more than one mixer. Suppose
it is Lyapunov stable. Since it is an isolated point of the collection of stationary states,
there is a neighborhood $B$ of $\alpha^*$ whose closure contains only the stationary state $\alpha^*$:
$\mathrm{cl}(B) \cap S_2 = \{\alpha^*\}$. By Lyapunov stability, as long as the initial state $\alpha(0)$ lies in a
sufficiently small neighborhood $B'$ of $\alpha^*$, the entire solution trajectory $(\alpha(t))_{t \in [0, \infty)}$ remains
in $B$.

Let $i \in N$ be one of the mixers in the Nash equilibrium $\alpha^*$. Since $i$ is indifferent between
his two pure strategies and the potential $U$ measures payoff differences, it follows that

$$U(\alpha^*) = U(-1, \alpha^*_{-i}) = U(+1, \alpha^*_{-i}).$$

Consequently, $U(\gamma_i, \alpha^*_{-i}) = U(\alpha^*)$ for all mixed strategies $\gamma_i$ of player $i$. For $\gamma_i \neq \alpha^*_i$
sufficiently close to $\alpha^*_i$, it follows that $(\gamma_i, \alpha^*_{-i}) \in B'$. Hence, the entire solution trajectory
$(\gamma(t))_{t \in [0, \infty)}$ with $\gamma(0) := (\gamma_i, \alpha^*_{-i})$ remains in $B$. Since its starting point is not stationary,
Proposition 3.1 implies that the Lyapunov function $U$ strictly increases along the trajectory,
until it may reach a stationary state. Let $\gamma^* \in \times_{j \in N} \Delta(A_j)$ be a limit point of the
trajectory $(\gamma(t))_{t \in [0, \infty)}$: there is a strictly increasing sequence of time points $t_m \to \infty$
such that $\lim_{m \to \infty} \gamma(t_m) = \gamma^*$. Such a limit point exists and has to be a stationary point
(Lemma A.1 of Sandholm, 2001, p. 104). Since $\mathrm{cl}(B) \cap S_2 = \{\alpha^*\}$ and the trajectory lies
in $B$, it follows that $\gamma^* = \alpha^*$. But then $\lim_{m \to \infty} U(\gamma(t_m)) = U(\alpha^*) = U(\gamma(0))$,
contradicting that the Lyapunov function is increasing along the trajectory. Conclude that $\alpha^*$
cannot be Lyapunov stable. For $\alpha^* \in S_3$, proceed similarly. As it is not a Nash equilibrium, some $i$ can
profitably deviate slightly (to remain inside $B'$), so the remaining trajectory must increase
the potential, but still have $\alpha^*$ as its limit point. □
4 Perturbed best-response dynamics and quantal response equilibria



4.1 Perturbed best-response dynamics 



Under stochastic fictitious play (e.g. Hofbauer and Hopkins, 2005; Hofbauer and Sandholm,
2002; Hopkins, 2002), players repeatedly play a normal form game (in discrete time). They
choose best replies to their beliefs on other players' actions on the basis of a perturbed
payoff function, with beliefs determined by the time average of past play. More specifically,
the state variable at time $t \in \mathbb{N}$ is a vector $Z^t \in \times_{i \in N} \Delta(A_i)$, where the $i$th component
$Z^t_i$ denotes the time average of player $i$'s past play up to time $t$. Players' initial choices
are arbitrary pure strategies; in later periods players best-respond to their beliefs $Z^t$, after
their payoffs have been subjected to random shocks. That is, for each $i \in N$, let $(\varepsilon_i^{a_i})_{a_i \in A_i}$
be a vector of payoff disturbances. The vector of payoff disturbances is independent and
identically distributed across players and over time. Let $\alpha_{-i} \in \times_{j \in N \setminus \{i\}} \Delta(A_j)$ be a belief.
The probability that player $i$ chooses action $a_i \in A_i$ is equal to the probability that

$$u_i(a_i, \alpha_{-i}) + \varepsilon_i^{a_i} \geq u_i(b_i, \alpha_{-i}) + \varepsilon_i^{b_i}$$

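for all $b_i \in A_i$. Then, the perturbed best-response dynamics associated with Gumbel-
distributed perturbations with parameter $\beta > 0$ is:

$$\forall i \in N, \forall a_i \in A_i : \quad \dot{\alpha}_i(a_i) = \frac{\exp[\beta u_i(a_i, \alpha_{-i})]}{\sum_{b_i \in A_i} \exp[\beta u_i(b_i, \alpha_{-i})]} - \alpha_i(a_i). \tag{4.1}$$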
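
The link between Gumbel-distributed shocks and the logit choice rule can be illustrated by simulation. The sketch below draws independent Gumbel shocks with scale $1/\beta$ for two actions and compares the frequency with which the first action has the higher perturbed payoff against the logit probability. The payoff values, the value of $\beta$, the number of draws, and the seed are assumptions chosen for the example.

```python
import math
import random

def gumbel(rng, beta):
    """Gumbel shock with scale 1/beta: minus the log of a unit exponential, over beta."""
    return -math.log(rng.expovariate(1.0)) / beta

def simulated_choice_frequency(u_minus, u_plus, beta, draws, seed=0):
    """Share of draws in which action -1 has the higher perturbed payoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        if u_minus + gumbel(rng, beta) >= u_plus + gumbel(rng, beta):
            hits += 1
    return hits / draws

def logit(u_minus, u_plus, beta):
    """Logit choice probability of action -1."""
    top = math.exp(beta * u_minus)
    return top / (top + math.exp(beta * u_plus))

frequency = simulated_choice_frequency(0.3, 0.0, beta=2.0, draws=200000)
print(frequency, logit(0.3, 0.0, beta=2.0))
```

Larger values of `beta` shrink the shocks, so the choice probabilities approach a deterministic best reply.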



Gumbel-distributed payoff perturbations correspond to control costs of the relative entropy
form. By Proposition 4.1 of Hofbauer and Sandholm (2002), the process in (4.1) has a strict
Lyapunov function that can be expressed in terms of the potential function and the control
cost functions. For each $i \in N$, let $\alpha_i$ denote the probability with which player $i$ chooses
the action $a_i = -1$. Then, the Lyapunov function for the process in (4.1) is defined by:

$$\forall \alpha \in \times_{i \in N} \Delta(A_i) : \quad V(\alpha) = U(\alpha) - \frac{1}{\beta} \sum_{i \in N} [\alpha_i \log(\alpha_i) + (1 - \alpha_i) \log(1 - \alpha_i)], \tag{4.2}$$



where $U$ is the potential function. Since control cost functions of the relative entropy form
satisfy the smoothness conditions of Proposition 4.2 of Hofbauer and Sandholm (2002), it
follows that:

Proposition 4.1. The collection of stationary states and recurrent points of the process
in (4.1) coincide.

Theorem 6.1 (iii) of Hofbauer and Sandholm (2002) now implies that the perturbed
best-response dynamic converges to these stationary states. Notice that the set of
stationary states coincides with the set of logit quantal response equilibria of the minority
game (McKelvey and Palfrey, 1995). When the perturbation terms go to zero, we obtain
Nash equilibria. As the set of Nash equilibria is not finite, we cannot apply Corollary
6.6 of Benaïm (1999) to characterize the subset of Nash equilibria to which the stochastic
process (4.1) converges. The set of Nash equilibria that are the limit points of a sequence
of logit quantal response equilibria is generally hard to characterize. In the next section,
we characterize this set for the three-player minority game.



4.2 Stationary points for the three-player minority game 

Consider the three-player minority game with $f_{-1} = f_{+1} = f$ strictly decreasing in
the number of users. As it involves a simple rescaling of functions satisfying [Mon] and
[Sym], we may without loss of generality set $f(2) = 0$ and $f(1) - f(3) = 1$. A potential
of the game is then given in Figure 4.1. The Nash equilibria of the three-player game
follow easily from the results in Section 2.2. Throughout this section, Nash equilibria are
denoted by $(p, q, r) \in [0, 1]^3$, where $p, q, r$ are the probabilities with which player 1, 2, and
3, respectively, choose $-1$. Then, the Nash equilibria of the game are $(1/2, 1/2, 1/2)$ and
$(1, 0, \lambda)$ for some $\lambda \in [0, 1]$, and permutations of these.

$$\begin{array}{c|cc} \text{Player 3: } -1 & -1 & +1 \\ \hline -1 & -1 & 0 \\ +1 & 0 & 0 \end{array} \qquad \begin{array}{c|cc} \text{Player 3: } +1 & -1 & +1 \\ \hline -1 & 0 & 0 \\ +1 & 0 & -1 \end{array}$$

Figure 4.1: A potential function of the 3-player minority game

Given parameter $\beta > 0$, the conditions for a logit quantal response equilibrium (QRE)
become:

$$p = \frac{1}{1 + \exp[-\beta(1 - q - r)]}, \tag{4.3}$$

$$q = \frac{1}{1 + \exp[-\beta(1 - p - r)]}, \tag{4.4}$$

$$r = \frac{1}{1 + \exp[-\beta(1 - p - q)]}. \tag{4.5}$$
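
The exponent $1 - q - r$ in condition (4.3) is the expected payoff advantage of action $-1$ over action $+1$ for player 1 under the normalization $f(2) = 0$, $f(1) - f(3) = 1$. The short derivation below is a sketch added for illustration; the symbol $u_1$ for player 1's expected payoff is introduced here and does not appear elsewhere in the text.

```latex
\begin{align*}
u_1(-1; q, r) &= f(1)\,(1-q)(1-r) + f(2)\,[q(1-r) + (1-q)r] + f(3)\,qr, \\
u_1(+1; q, r) &= f(1)\,qr + f(2)\,[q(1-r) + (1-q)r] + f(3)\,(1-q)(1-r), \\
u_1(-1; q, r) - u_1(+1; q, r) &= \bigl(f(1) - f(3)\bigr)\,[(1-q)(1-r) - qr] \;=\; 1 - q - r.
\end{align*}
```

The same computation with the roles of the players permuted gives the exponents in (4.4) and (4.5).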

Given $\beta > 0$, we denote a logit QRE in which players 1, 2 and 3 play $-1$ with probability
$p, q, r$ by $(p, q, r, \beta)$. We now characterize the set of Nash equilibria that are the limit of a
sequence of quantal response equilibria when $\beta \to \infty$.

Proposition 4.2. Let $(p(\beta_n), q(\beta_n), r(\beta_n), \beta_n)_{n \in \mathbb{N}}$ be a sequence of logit quantal response
equilibria: $\beta_n \to \infty$ and for each $n \in \mathbb{N}$, the quadruple $(p(\beta_n), q(\beta_n), r(\beta_n), \beta_n)$ solves
equations (4.3) - (4.5). A Nash equilibrium $(p, q, r)$ is the limit of such a sequence if and
only if one of the following conditions holds:

(a) (p, q, r) is a pure Nash equilibrium, 

(b) (p,q,r) is a Nash equilibrium with exactly one mixer who mixes uniformly, 

(c) (p,q,r) = (1/2,1/2,1/2). 

The proof is in Appendix B. Proposition 4.2 thus characterizes the set of stationary points
of the perturbed best response dynamics (4.1) for the three-player minority game.



5 Best-reply learning with limited memory 



In this section, we consider discrete time learning models in which players choose best
replies to beliefs that are supported by observed play in the recent past. We study two such
models, the learning model proposed by Hurkens (1995) and the model of Kets and Voorneveld
(2005). First, in the learning model of Hurkens, players may choose any action that is a
best reply to some belief over other players' actions that is consistent with their recent
past play. The limiting behavior of this learning process is easy to characterize. Hurkens
shows that the Markov processes defined by his learning process eventually settle down in
so-called minimal curb sets (Basu and Weibull, 1991). Minimal curb sets are product sets
of pure strategies containing all best responses against beliefs restricted to the
recommendations to the remaining players. Unfortunately, this does not provide a sharp prediction
in the minority game. As shown by Tercieux and Voorneveld (2005), the unique minimal
curb set in the minority game consists of the entire strategy space. That is, over time, all
players will keep on choosing both actions.



Secondly, we study the model of Kets and Voorneveld (2005). As in the model of
Hurkens (1995), it is assumed that players best-respond to beliefs over others' play
supported by recent past play. In addition, players display a so-called recency bias: when there
are multiple best replies to a given belief, a player chooses the best reply that he most
recently played.³ Kets and Voorneveld show that play converges to one of the minimal prep
sets of the game under this learning process. Minimal prep sets (Voorneveld, 2004) are a
set-valued solution concept for strategic games that combines a standard rationality
condition, stating that the set of recommended strategies to each player must contain at least
one best reply to whatever belief he may have that is consistent with the recommendations
to the other players, with players' aim at simplicity, which encourages them to maintain
a set of strategies that is as small as possible. Think of the set of recommendations to a
player in a minimal prep set as a well-packed suitcase for a holiday: you want to be
prepared for different kinds of weather, but bringing all five of your umbrellas and all seven
bathing suits may be overdoing it. Tercieux and Voorneveld (2005) show that the minimal
prep sets of the minority game and the pure Nash equilibria of the game coincide. Hence,
under the learning model of Kets and Voorneveld, play in the minority game converges to
one of the pure Nash equilibria of the game.



In both the model of Hurkens (1995) and that of Kets and Voorneveld (2005), players need to
recall a sufficiently long period of play in order for play to converge. We now turn to the
question of what this lower bound on players' memory is. More specifically, suppose players
remember actions that were chosen during the past $T \in \mathbb{N}$ periods. A memory length
of $T = 1$ is clearly insufficient for a best-reply learning process with limited memory to
converge. If players chose an action profile yesterday that is not a pure Nash equilibrium,
then some action, say $-1$, was chosen by more than $k + 1$ players. Hence, everyone
chooses the unique best reply $+1$ today, and consequently the unique best reply $-1$ to this
tomorrow, and the unique best reply $+1$ to this the day after tomorrow, with action profiles
forever cycling between these two extremes. However, we show that a memory length $T = 2$
is sufficient for the learning process of Kets and Voorneveld (2005) to converge to pure
Nash equilibria.
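
The cycling under a one-period memory can be made concrete with a short sketch. It assumes a strictly decreasing payoff function, so that only the number of other players on $-1$ matters. The function name `myopic_reply` and the tie rule (keep yesterday's action when indifferent) are illustrative choices introduced here, not part of the models discussed above.

```python
def myopic_reply(profile, k):
    """Best replies of all 2k+1 players to yesterday's profile (memory T = 1)."""
    n = 2 * k + 1
    assert len(profile) == n
    total_minus = sum(1 for a in profile if a == -1)
    replies = []
    for own in profile:
        # Number of *other* players who chose -1 yesterday.
        others_minus = total_minus - (1 if own == -1 else 0)
        if others_minus < k:
            replies.append(-1)   # joining -1 puts the player in the smaller group
        elif others_minus > k:
            replies.append(+1)   # joining +1 puts the player in the smaller group
        else:
            replies.append(own)  # indifferent: hypothetical tie rule, keep own action
    return tuple(replies)


k = 2
path = [(-1, -1, -1, -1, +1)]    # four of five players on -1: not a pure Nash equilibrium
for _ in range(4):
    path.append(myopic_reply(path[-1], k))
for profile in path:
    print(profile)
```

Starting from any profile that is not a pure Nash equilibrium, the printed path alternates between the two unanimous profiles, which is the cycle described in the text.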

When the memory length $T$ is equal to 2, the process is a Markov chain with state
space $H = \{(a^1, a^2) \mid a^1, a^2 \in A^{2k+1}\}$, where a history $h = (a^1, a^2) \in H$ indicates that
the $2k + 1$ players remember that they chose action profile $a^1$ one period ago and $a^2$ two
periods ago. Having defined the set $H$ of states, we proceed to the transition probability
functions $P : H \times H \to [0, 1]$, where $P(h, h') \in [0, 1]$ is the probability of moving from
state $h \in H$ to state $h' \in H$ in one period and $\sum_{h' \in H} P(h, h') = 1$ for all $h \in H$. We do
not need to specify exact probabilities: for the convergence result, only sign restrictions
are needed.

³ The behavioral economics literature provides several motivations for the common observation that
agents appear somewhat unwilling to deviate from their recent choices. This can be attributed to e.g. the
formation of habits (cf. Young, 1998) or the use of rules of thumb (cf. Ellison and Fudenberg, 1993).

Moving from $h = (a^1, a^2)$ to $h' = (b^1, b^2)$ in one period means that $h'$ is obtained from
$h$ after one more round of play, i.e., by appending a new profile of most recent actions.
Formally:

[P1] $h' = (b^1, b^2)$ is a successor of $h = (a^1, a^2)$, i.e., $b^2 = a^1$.



Moreover, by moving from $h = (a^1, a^2)$ to $h' = (b^1, b^2)$, the processes in Kets and Voorneveld
(2005) require that each player $i \in N$ chooses a best reply to a belief
$\alpha_{-i} \in \times_{j \in N \setminus \{i\}} \Delta(\{a_j^1, a_j^2\})$
with support in the product set of actions chosen in the previous $T = 2$ periods, whenever
possible sticking to the most recent best reply. In games with just two actions, the latter
simply means that you continue playing as you did in the previous round, unless that
action is no longer a best reply to your current belief. Formally:

[P2] For each $i \in N$, $b_i^1$ is a best reply to some belief $\alpha_{-i} \in \times_{j \in N \setminus \{i\}} \Delta(\{a_j^1, a_j^2\})$. Moreover,
$b_i^1 = a_i^1$ if and only if $a_i^1$ is a best reply to $\alpha_{-i}$.

Proposition 5.1. Consider a Markov chain on $H$ with transition probability function $P$,
where, for all states $h, h' \in H$, it holds that $P(h, h') > 0$ if and only if [P1] and [P2] are
true. This Markov process eventually settles down in a pure Nash equilibrium.

Proof. Let $h_0 = (a^1, a^2) \in H$ and distinguish two cases:

Case 1: $a^1$ is a pure Nash equilibrium. By [P2], the players will react with positive
probability to the belief that everybody plays as in $a^1$. Each player's most recent best
reply is to continue playing as in $a^1$, so the process moves with positive probability to the
history $h_1 = (a^1, a^1)$. From here on, the only feasible belief based on the past two periods
is that the players play $a^1$ and the most recent best reply implies that they will continue to
play $a^1$: the process stays in state $h_1$ and play has converged to a pure Nash equilibrium.

Case 2: $a^1$ is not a pure Nash equilibrium. By Proposition 2.1, some alternative,
w.l.o.g. $-1$, was chosen by a set $S \subseteq N$ of players with $|S| > k + 1$. Each player's unique
best response to $a^1$ is therefore to choose $+1$. By [P2], the process moves with positive
probability to state $h_1 = ((+1, \ldots, +1), a^1)$. Let $a^* \in A^{2k+1}$ be a pure Nash equilibrium
where $k + 1$ members of $S$ choose $+1$ and the others choose $-1$. Again using [P2], the
process moves with positive probability from $h_1$ to $h_2 = (a^*, (+1, \ldots, +1))$:

• For each of the selected $k + 1$ members of $S$, $+1$ is the unique best reply to the belief
drawn from the past two periods that at least $k + 1$ other players from $S$ will choose
$-1$.

• For each of the remaining $k$ players, $-1$ is the unique best response to the belief that
all other players will continue to play last period's profile $(+1, \ldots, +1)$.

Notice that history $h_2$ belongs to case 1.

Conclude that, regardless of the initial state $h_0$, the Markov process moves with positive
probability to an absorbing state where the players continue to play one of the game's
pure Nash equilibria. As the Markov process is finite and the initial state was chosen
arbitrarily, this will eventually happen with probability one (Kemeny and Snell, 1976): play
eventually settles down in a pure Nash equilibrium. □
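
The convergence result can be illustrated by simulation. The sketch below is one possible instance of a process satisfying [P1] and [P2], not the authors' specification: it assumes that each player draws a pure belief by picking, for every opponent, one of the two remembered actions with equal probability. The names `step`, `is_pure_nash` and `simulate` are hypothetical. Because payoffs are strictly decreasing in the number of users, a player's best reply depends only on the believed number of opponents choosing $-1$.

```python
import random


def step(history, k, rng):
    """One period of the T = 2 process; history = (a1, a2), most recent profile first."""
    a1, a2 = history
    n = 2 * k + 1
    new_profile = []
    for i in range(n):
        # Assumption: a pure belief, sampling for each opponent j one of a1[j], a2[j].
        believed_minus = sum(
            1 for j in range(n) if j != i and rng.choice((a1[j], a2[j])) == -1
        )
        if believed_minus < k:
            action = -1        # -1 is the unique best reply to this belief
        elif believed_minus > k:
            action = +1        # +1 is the unique best reply to this belief
        else:
            action = a1[i]     # indifferent: recency bias, repeat the last action
        new_profile.append(action)
    return (tuple(new_profile), a1)  # [P1]: the old a1 becomes the new a2


def is_pure_nash(profile, k):
    """Pure Nash equilibria split the 2k+1 players into groups of size k and k+1."""
    return sum(1 for a in profile if a == -1) in (k, k + 1)


def simulate(k, seed=0, max_periods=10_000):
    """Run until an absorbing state (a*, a*) with a* a pure Nash equilibrium is reached."""
    rng = random.Random(seed)
    n = 2 * k + 1
    history = tuple(tuple(rng.choice((-1, 1)) for _ in range(n)) for _ in range(2))
    for period in range(max_periods):
        a1, a2 = history
        if a1 == a2 and is_pure_nash(a1, k):
            return period, a1
        history = step(history, k, rng)
    return None  # not absorbed within max_periods


print(simulate(k=1))
```

A call such as `simulate(k=1)` returns the period in which play was absorbed together with the absorbing profile. Restricting beliefs to pure profiles only removes some of the transitions allowed by [P2], but the paths used in the proof above rely on pure beliefs alone, so absorption still occurs with probability one.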



Some remarks are in order. First, notice that, due to the symmetry of the minority game,
a minor revision of the proof indicates that convergence to pure Nash equilibria can be
established also if the only thing players remember from the past two periods is what
they chose themselves and how many others did so. This comes at the expense of a more
complex notation and a larger deviation from that of Kets and Voorneveld (2005).

Secondly, the result that the lower bound on players' memory length is two indicates
that the requirement on memory length in Kets and Voorneveld (2005) for general
games can be decreased significantly in specific cases. Although the convergence result
of Kets and Voorneveld (2005) for the entire class of finite strategic games also applies
here, we include an explicit proof: the structure of a minority game allows us to give a
considerably shorter proof of the convergence result for this specific game, and allows us
to derive a much sharper bound on the memory length.



6 Concluding remarks 

Though congestion games are apparently simple, game-theorists' understanding of play
in such games is far from complete, for two reasons. Firstly, well-known learning models
do not always provide unequivocal predictions for such games. In this paper, we have
characterized the Nash equilibria and the limiting behavior of several well-known learning
models in a simple congestion game. We show that these learning models provide different
predictions. Secondly, experimental results are not always in line with theoretical
predictions. In experiments on market entry games, aggregate play is largely consistent with
equilibrium play, with the number of entrants close to capacity, but individual play
generally does not resemble Nash play (see e.g. Ochs, 1999). Hence, an interesting direction for
future research would be to test behavior in minority games experimentally. This provides
the opportunity to compare the performance of different learning models in the minority
game.⁴ Moreover, it may help to better understand behavior in other congestion games
such as the market entry games, as the symmetry of the game makes it harder for players
to play repeated-game strategies. In experiments on the (asymmetric) market entry games,
players sometimes seem to follow such strategies, with some players choosing to enter the
market in every round in the initial periods, regardless of payoffs, to obtain a reputation
for always entering (see Duffy and Hopkins, 2005, for a discussion). Such strategies are
useless in the minority game, so that it may be hoped that the minority game offers a
cleaner test of the theory.



Appendix A Stochastic dominance for binomial distributions

Let $X$ have a binomial distribution with $n \in \mathbb{N}$ draws and success probability $p \in [0, 1]$;
briefly, a $B(n, p)$ distribution: $X = X_1 + \cdots + X_n$, where $X_1, \ldots, X_n$ are i.i.d. $B(1, p)$.
Distributions with a higher success rate $p$ stochastically dominate those with a lower one
(cf. Ross, 1996, Exc. 9.9). Formally, in terms of cumulative distributions, if $p, q \in [0, 1]$
and $p < q$, then



$$\text{for all } m \in \{0, \ldots, n\}: \quad \sum_{k=0}^{m} \binom{n}{k} p^k (1 - p)^{n-k} \;\geq\; \sum_{k=0}^{m} \binom{n}{k} q^k (1 - q)^{n-k},$$

with strict inequality if $m < n$. This follows by substitution if $m = 0$ or $m = n$. So let
$m \in \{1, \ldots, n - 1\}$. It suffices to show that the function

$$[0, 1] \ni p \mapsto \sum_{k=0}^{m} \binom{n}{k} p^k (1 - p)^{n-k}$$

has a negative derivative on $(0, 1)$. The derivative, after rewriting, becomes

\begin{align*}
& \sum_{k=0}^{m} \binom{n}{k} \left[ k p^{k-1} (1 - p)^{n-k} - (n - k) p^k (1 - p)^{n-k-1} \right] \\
&= \sum_{k=0}^{m} \binom{n}{k} p^{k-1} (1 - p)^{n-k-1} \left[ k - np \right] \\
&= n \sum_{k=0}^{m-1} \binom{n-1}{k} p^k (1 - p)^{n-1-k-1} - n \sum_{k=0}^{m} \binom{n}{k} p^k (1 - p)^{n-k-1} \\
&= \frac{n}{1 - p} \left[ \sum_{k=0}^{m-1} \binom{n-1}{k} p^k (1 - p)^{n-1-k} - \sum_{k=0}^{m} \binom{n}{k} p^k (1 - p)^{n-k} \right] \\
&= \frac{n}{1 - p} \left[ P\left( \sum_{k=1}^{n-1} X_k \leq m - 1 \right) - P\left( \sum_{k=1}^{n} X_k \leq m \right) \right].
\end{align*}

Consider the term in square brackets. The first probability is strictly smaller than the
second, as the first event (at most $m - 1$ successes in the first $n - 1$ draws) implies the
second one (at most $m$ successes during all $n$ draws), whereas the latter also includes the
positive-probability event that the first $n - 1$ draws yield exactly $m$ successes and the last
draw fails, so that $\sum_{k=1}^{n} X_k = m$. Hence, the derivative is negative, as we had
to show.

⁴ Bottazzi and Devetag (2007) and Chmura and Pitz (2006) present experiments on the minority game.
However, their results cannot be directly used to compare the performance of different learning models, as
they do not test explicitly whether play converges to particular strategy profiles or to particular product
sets of actions. Both papers merely study the effect of information on players' aggregate payoffs.

Write a function $g : \{0, 1, \ldots, n\} \to \mathbb{R}$ as the sum of indicator functions:

\begin{align*}
g &= g(n) I_{\{0, \ldots, n\}} + (g(n - 1) - g(n)) I_{\{0, \ldots, n-1\}} + \cdots + (g(0) - g(1)) I_{\{0\}} \\
  &= g(n) I_{\{0, \ldots, n\}} + \sum_{k=0}^{n-1} (g(k) - g(k + 1)) I_{\{0, \ldots, k\}}.
\end{align*}

Then

$$E[g(X)] = g(n) + \sum_{k=0}^{n-1} (g(k) - g(k + 1)) P(X \leq k).$$

If $g$ is nonconstant, nonincreasing, then $g(k) - g(k + 1) \geq 0$ for all $k = 0, \ldots, n - 1$, with
at least one strict inequality. As shown above, the cumulative probabilities are strictly
decreasing in the success probability $p$. So $E[g(X)]$ becomes a strictly decreasing function
of $p$: the higher the probability of success, the larger the probability that $g(X)$ achieves a
low value. Of course, for nondecreasing functions the converse holds.
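
The statements of this appendix are easy to check numerically. The following sketch uses only the Python standard library; all function names are illustrative. It compares binomial distribution functions for two success probabilities, evaluates the closed form of the derivative obtained above, and computes $E[g(X)]$ both directly and through the indicator decomposition.

```python
from math import comb


def binom_cdf(m, n, p):
    """P(X <= m) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m + 1))


def cdf_derivative(m, n, p):
    """Closed form from the text: n/(1-p) * [P(S_{n-1} <= m-1) - P(S_n <= m)]."""
    return n / (1 - p) * (binom_cdf(m - 1, n - 1, p) - binom_cdf(m, n, p))


def expectation_direct(g, n, p):
    """E[g(X)] computed from the probability mass function."""
    return sum(g(k) * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))


def expectation_via_cdf(g, n, p):
    """E[g(X)] = g(n) + sum_k (g(k) - g(k+1)) P(X <= k), as derived above."""
    return g(n) + sum((g(k) - g(k + 1)) * binom_cdf(k, n, p) for k in range(n))


n = 5
g = lambda k: (n - k) ** 2          # an arbitrary nonincreasing test function on {0, ..., n}
for m in range(n):
    print(m, binom_cdf(m, n, 0.3), binom_cdf(m, n, 0.6), cdf_derivative(m, n, 0.3))
print(expectation_direct(g, n, 0.3), expectation_via_cdf(g, n, 0.3))
print(expectation_direct(g, n, 0.6), expectation_via_cdf(g, n, 0.6))
```

For every $m < n$ the distribution function at $p = 0.3$ exceeds the one at $p = 0.6$ and the derivative is negative, while the two ways of computing the expectation agree up to rounding.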



Appendix B Proof of Proposition 4.2



The only Nash equilibria not covered by (a), (b), and (c) are those with one player
(w.l.o.g. player 1) choosing $-1$, one player (w.l.o.g. player 2) choosing $+1$, and the third
player (w.l.o.g. player 3) mixing with probability $\lambda \in (0, 1) \setminus \{1/2\}$.

Suppose, to the contrary, that such an equilibrium is the limit of a sequence of logit
QRE $(p(\beta_n), q(\beta_n), r(\beta_n), \beta_n)_{n \in \mathbb{N}}$ where $\beta_n \to \infty$ and $(p(\beta_n), q(\beta_n), r(\beta_n), \beta_n)$ solves
equations (4.3) to (4.5) for a logit QRE. In the selected equilibrium, both the $(-1)$-player and
the $(+1)$-player choose their unique best response. By Lemma 3 in Turocy (2005, p. 251),
$\beta_n (1 - p(\beta_n)) \to 0$ and $\beta_n q(\beta_n) \to 0$. Substituting this in the logit QRE condition (4.5)
for the third player gives that

$$r(\beta_n) = \frac{1}{1 + \exp[-\beta_n (1 - p(\beta_n) - q(\beta_n))]} \to \frac{1}{2},$$

contradicting the assumption that $\lim_{n \to \infty} r(\beta_n) = \lambda \neq 1/2$.

It remains to show that the classes of equilibria in the proposition are indeed limits of a 
sequence of logit QREs. 

(a): By symmetry, it suffices to show that the pure Nash equilibrium (p, q, r) = (1, 1, 0) is 
the limit of a sequence of logit QREs. 

Step 1: For each $\beta > 4$ there is a logit QRE $(p, q, r, \beta)$ with $p = q \in (1/2, 1)$ and $r < 1/2$.
Proof of Step 1: Based on conditions (4.3) - (4.5) for a logit QRE and the substitution
$p = q$, define for all $\beta > 0$ and $p \in [1/2, 1]$:

$$r(p, \beta) := \frac{1}{1 + \exp[-\beta(1 - 2p)]}, \qquad f(p, \beta) := \frac{1}{1 + \exp[-\beta(1 - p - r(p, \beta))]}.$$

Let $\beta > 4$. We show that there is a solution $p^* \in (1/2, 1)$ to the equation $p = f(p, \beta)$.
Substitution in (4.3) - (4.5) yields that $(p, q, r, \beta) = (p^*, p^*, r(p^*, \beta), \beta)$ is a logit QRE with
the desired properties. Notice that

\begin{align*}
\frac{\partial f(p, \beta)}{\partial p}
&= \frac{-\beta \exp(-\beta(1 - p - r(p, \beta)))}{\left(1 + \exp(-\beta(1 - p - r(p, \beta)))\right)^2} \left( 1 + \frac{\partial r(p, \beta)}{\partial p} \right) \\
&= \frac{-\beta \exp(-\beta(1 - p - r(p, \beta)))}{\left(1 + \exp(-\beta(1 - p - r(p, \beta)))\right)^2} \left( 1 + \frac{-2\beta \exp(-\beta(1 - 2p))}{\left(1 + \exp(-\beta(1 - 2p))\right)^2} \right).
\end{align*}

Since $f(1/2, \beta) = 1/2$ and

$$\frac{\partial f(1/2, \beta)}{\partial p} = \frac{\beta}{4} \left( \frac{\beta}{2} - 1 \right) > 1$$

for $\beta > 4$, it follows that $f(p, \beta) > p$ for $p$ slightly larger than 1/2. Moreover, $f(1, \beta) < 1$.
Hence, by the Intermediate Value Theorem applied to $f(\cdot, \beta)$, $f(p^*, \beta) = p^*$ for some
$p^* \in (1/2, 1)$. As $p^* > 1/2$, the definition of $r$ gives $r(p^*, \beta) < 1/2$.

Step 2: Let $\beta_0 > 4$ and let $p_0 \in (1/2, 1)$ solve $f(p_0, \beta_0) = p_0$. This is possible by Step 1.
The function $f(p_0, \cdot)$ is strictly increasing on $[\beta_0, \infty)$.

Proof of Step 2: By definition of $f$, it suffices to show that the derivative of

$$\beta \mapsto \beta (1 - p_0 - r(p_0, \beta)), \qquad \beta \in [\beta_0, \infty),$$

is positive. This derivative equals

$$-\beta \frac{\partial r(p_0, \beta)}{\partial \beta} + 1 - p_0 - r(p_0, \beta). \tag{B.1}$$

Using $p_0 > 1/2$ and the definition of $r$, it follows that $\partial r(p_0, \beta) / \partial \beta < 0$, i.e., the function
$r(p_0, \cdot)$ is strictly decreasing on $[\beta_0, \infty)$. Moreover, as $p_0 = f(p_0, \beta_0) > 1/2$, it follows from
the definition of $f$ that $1 - p_0 - r(p_0, \beta_0) > 0$. As $r(p_0, \cdot)$ is decreasing, this implies that
$1 - p_0 - r(p_0, \beta) > 0$ for each $\beta \in [\beta_0, \infty)$. Therefore, the expression in (B.1) is positive.

Step 3: The pure Nash equilibrium $(p, q, r) = (1, 1, 0)$ is the limit of a sequence of QREs.
Proof of Step 3: Let $\beta_0 > 4$ and consider a QRE $(p_0, q_0, r_0, \beta_0)$ as in Step 1. Set
$\beta_1 = \beta_0 + 1$. By Step 2, $p_0 = f(p_0, \beta_0) < f(p_0, \beta_1)$. Moreover, $f(1, \beta_1) < 1$. By the
Intermediate Value Theorem applied to the function $f(\cdot, \beta_1)$, there is a $p_1 \in (p_0, 1)$ with
$p_1 = f(p_1, \beta_1)$. Conclude that there is a QRE $(p_1, q_1, r_1, \beta_1)$ with

$$p_1 = q_1 = f(p_1, \beta_1) > p_0, \qquad r_1 = r(p_1, \beta_1), \qquad \beta_1 = \beta_0 + 1.$$

Repeating this construction allows us to define a sequence $(p_n, q_n, r_n, \beta_n)_{n \in \mathbb{N}}$ of solutions
to (4.3) - (4.5) satisfying the conditions of Step 1 and with $\beta_n \to \infty$ and $(p_n)_{n \in \mathbb{N}}$ strictly
increasing.

As $(p_n, q_n, r_n)_{n \in \mathbb{N}}$ is a sequence in the compact strategy space, we may assume w.l.o.g.
that the sequence converges. Its limit $(p, q, r)$ must be a Nash equilibrium (McKelvey and Palfrey,
1995). As $(p_n)_{n \in \mathbb{N}}$ is a strictly increasing sequence in $(1/2, 1)$ and $(r_n)_{n \in \mathbb{N}}$ is a sequence in
$(0, 1/2)$, it must be that $p = q > 1/2$ and $r \leq 1/2$. The only Nash equilibrium of the game with
these properties is $(p, q, r) = (1, 1, 0)$.

(b): By symmetry, it suffices to show that the Nash equilibrium $(p, q, r) = (1, 0, 1/2)$ is
the limit of a sequence of logit QREs. The steps are similar to those in (a). Therefore, the
proof is kept short.

Step 1: For each $\beta > 4$ there is a logit QRE $(p, q, r, \beta)$ with $p \in (1/2, 1)$, $q = 1 - p$, $r = 1/2$.
Proof of Step 1: Let $\beta > 4$. Based on the substitution $q = 1 - p$ and $r = 1/2$ in condition
(4.3) for a logit QRE, define

$$g(p, \beta) := \frac{1}{1 + \exp[\beta(1/2 - p)]}.$$

We show that there is a solution $p^* \in (1/2, 1)$ to the equation $p = g(p, \beta)$. Substitution in
(4.3) - (4.5) yields that $(p, q, r, \beta) = (p^*, 1 - p^*, 1/2, \beta)$ is a logit QRE with the desired
properties. Notice that

$$\frac{\partial g(p, \beta)}{\partial p} = \frac{\beta \exp[\beta(1/2 - p)]}{\left(1 + \exp[\beta(1/2 - p)]\right)^2}.$$

Since $g(1/2, \beta) = 1/2$ and $\partial g(1/2, \beta) / \partial p = \beta/4 > 1$, it follows that $g(p, \beta) > p$ for $p$
slightly larger than 1/2. Moreover, $g(1, \beta) < 1$, so the Intermediate Value Theorem implies
that $g(p^*, \beta) = p^*$ for some $p^* \in (1/2, 1)$.

Step 2: For each $p_0 \in (1/2, 1)$, the function $g(p_0, \cdot)$ is strictly increasing on $(0, \infty)$.
Proof of Step 2: Immediate from the definition of $g$.

Step 3: The Nash equilibrium $(p, q, r) = (1, 0, 1/2)$ is the limit of a sequence of logit
QREs.

Proof of Step 3: Reasoning as in the proof of Step 3 in part (a) allows us to construct
a sequence $(p_n, q_n, r_n, \beta_n)_{n \in \mathbb{N}}$ of solutions to (4.3) - (4.5) satisfying the conditions of Step
1 and with $\beta_n \to \infty$ and $(p_n)_{n \in \mathbb{N}}$ strictly increasing. As $(p_n, q_n, r_n)_{n \in \mathbb{N}}$ is a sequence in
the compact strategy space, we may assume w.l.o.g. that the sequence converges. Its
limit $(p, q, r)$ must be a Nash equilibrium (McKelvey and Palfrey, 1995). As $(p_n)_{n \in \mathbb{N}}$ is a
strictly increasing sequence in $(1/2, 1)$, $q_n = 1 - p_n$ and $r_n = 1/2$ for all $n \in \mathbb{N}$, it must be that
$p > 1/2$, $q = 1 - p$, $r = 1/2$. The only Nash equilibrium of the game with these properties
is $(p, q, r) = (1, 0, 1/2)$.
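
The constructions in parts (a) and (b) can be illustrated numerically. The sketch below is not part of the proof: it locates a fixed point of $f(\cdot, \beta)$ and of $g(\cdot, \beta)$ by bisection for a few values of $\beta > 4$ and checks the result against conditions (4.3) - (4.5). The helper names (`sigma`, `fixed_point`, `qre_residual`) and the left end point 0.501 of the search interval are assumptions made for this example.

```python
from math import exp


def sigma(z, beta):
    """Logit response 1 / (1 + exp(-beta * z))."""
    return 1.0 / (1.0 + exp(-beta * z))


def r_of_p(p, beta):
    """r(p, beta) from part (a): response of player 3 when p = q."""
    return sigma(1.0 - 2.0 * p, beta)


def f(p, beta):
    """f(p, beta) from part (a)."""
    return sigma(1.0 - p - r_of_p(p, beta), beta)


def g(p, beta):
    """g(p, beta) from part (b), i.e. 1 / (1 + exp(beta * (1/2 - p)))."""
    return sigma(p - 0.5, beta)


def fixed_point(func, beta, lo=0.501, hi=1.0, tol=1e-12):
    """Bisection on func(p, beta) - p, assuming a sign change on [lo, hi]."""
    assert func(lo, beta) - lo > 0 > func(hi, beta) - hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid, beta) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def qre_residual(p, q, r, beta):
    """Largest violation of the logit QRE conditions (4.3)-(4.5)."""
    return max(
        abs(p - sigma(1.0 - q - r, beta)),
        abs(q - sigma(1.0 - p - r, beta)),
        abs(r - sigma(1.0 - p - q, beta)),
    )


for beta in (5.0, 10.0, 20.0, 40.0):
    pa = fixed_point(f, beta)                      # branch of part (a): p = q
    pb = fixed_point(g, beta)                      # branch of part (b): q = 1 - p, r = 1/2
    print(beta, (pa, pa, r_of_p(pa, beta)), (pb, 1.0 - pb, 0.5))
```

As $\beta$ grows, the first triple moves towards $(1, 1, 0)$ and the second towards $(1, 0, 1/2)$, in line with Step 3 of each part.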

(c): It follows by substitution that $(p, q, r, \beta) = (1/2, 1/2, 1/2, \beta)$ is a logit QRE for all
$\beta > 0$. Consequently, the Nash equilibrium $(p, q, r) = (1/2, 1/2, 1/2)$ is the limit of a
sequence of logit QREs with $\beta \to \infty$.